I/O virtual memory (IOMMU) support #327


Open
wants to merge 12 commits into base: main

Conversation

@XanClic commented May 30, 2025

Summary of the PR

This PR adds support for an IOMMU, and thus for I/O virtual memory handling.

New Memory Trait: IoMemory

Handling I/O virtual memory requires a new interface to access guest memory:
GuestMemory does not allow specifying the required access permissions, which
is necessary when working with MMU-guarded memory.

We could add memory access methods with such a permissions parameter to
GuestMemory, but I prefer to provide a completely new trait instead. This
ensures that users will only use the interface that actually works with
(potentially) I/O virtual memory, i.e.:

  • They must always specify the required permissions,
  • They cannot (easily) access the memory regions directly, because doing so
    generally assumes that regions are long, contiguous, and that any address in
    a given range will be in the same memory region. This is no longer the case
    with virtual memory, which is heavily fragmented into pages.

That is, adding a new trait (IoMemory) allows us to catch a lot of potential
mistakes at compile time, which I feel is much better than finding out at
runtime that some place forgot to specify the access permissions.
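
(As an illustration of the permission-carrying style, here is a small self-contained toy; the trait, its single method, and the Flat type are illustrative stand-ins, not vm-memory’s actual IoMemory definition:)

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Permissions {
    Read,
    Write,
}

trait PermissionedMemory {
    /// Every access must state its intent; the type's internal layout
    /// (regions, pages) stays hidden behind the trait.
    fn access(&self, addr: u64, len: usize, perm: Permissions) -> Result<(), String>;
}

struct Flat(Vec<u8>);

impl PermissionedMemory for Flat {
    fn access(&self, addr: u64, len: usize, perm: Permissions) -> Result<(), String> {
        // A real implementation would translate page by page and check the
        // mapping's permissions; this toy only checks bounds.
        match (addr as usize).checked_add(len) {
            Some(end) if end <= self.0.len() => Ok(()),
            _ => Err(format!("cannot {:?} {} bytes at {:#x}", perm, len, addr)),
        }
    }
}

fn main() {
    let mem = Flat(vec![0u8; 0x1000]);
    assert!(mem.access(0xff0, 0x10, Permissions::Read).is_ok());
    assert!(mem.access(0xff0, 0x20, Permissions::Write).is_err());
}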

Unfortunately, this is an incompatible change because we need to decide on a
single guest memory trait that we expect users to primarily use: We can only
have one blanket implementation of e.g. Bytes, and this PR changes that
blanket implementation to be on IoMemory instead of GuestMemory, because we
want to prefer IoMemory with its permission-aware interface.

While this PR does provide a blanket implementation of IoMemory for all
GuestMemory types, Rust isn’t fully transitive here: a blanket
impl IoMemory for GuestMemory together with a blanket impl Bytes for IoMemory
does not implicitly give us an impl Bytes for GuestMemory.

What this means can be seen in virtio-queue (in vm-virtio): It uses trait bounds
like M: GuestMemory only, but then expects to be able to use the Bytes
trait. This is no longer possible; the trait bound must be extended to
M: GuestMemory + Bytes or replaced by M: IoMemory (the latter is what we
want).

Guest Address Type

Another consideration is that I originally planned to introduce new address
types. GuestAddress currently generally refers to a guest physical address
(GPA); but we now also need to deal with I/O virtual addresses (IOVAs), and an
IOMMU generally doesn’t translate those into GPAs, but into VMM user space
addresses (VUAs), so there are now three kinds of addresses. Ideally, each of
those would get its own type; but I felt that:

  • This would require too many changes from our users, and
  • You don’t even know whether the address you use on an IoMemory object is an
    IOVA or a GPA. It depends on whether the IOMMU is enabled or not, which is
    generally a runtime question.

Therefore, I kept GuestAddress as the only type, and it may refer to any of
the three kinds of addresses (GPAs, IOVAs, VUAs).

Async Accesses

I was considering whether to also make memory accesses optionally async. The
vhost-user IOMMU implementation basically needs two vhost-user socket roundtrips
per IOTLB miss, which can make guest memory accesses quite slow. An async
implementation could allow mitigating that.

However, I decided against it (for now), because this would also require
extensive changes in all of our consuming crates to really be useful: Anything
that does a guest memory access should then be async.

I think if we want to add this functionality later, it should be possible in a
compatible manner.

Changes Necessary in Other Crates

vm-virtio

Implementation: https://gitlab.com/hreitz/vm-virtio/-/commits/iommu

As stated above, places that bind M: GuestMemory but expect the Bytes trait
to also be implemented need to be changed to M: GuestMemory + Bytes or
M: IoMemory. I opted for the latter approach, and basically replaced all
GuestMemory instances by IoMemory.

(That is what we want because dropping GuestMemory in favor of IoMemory
ensures that all vm-virtio crates can work with virtual memory.)

vhost

Implementation: https://gitlab.com/hreitz/vhost/-/commits/iommu

Here, the changes necessitated by updating vm-memory are quite marginal, and
have a similar cause: instead of the Bytes trait, it is the GuestAddressSpace
trait that is required. The resolution is the same: switch from requiring
GuestMemory to IoMemory.

The rest of the commits concern themselves with implementing VhostUserIommu and
allowing users to choose to use IommuMemory<GuestMemoryMmap, VhostUserIommu>
instead of only GuestMemoryMmap.

virtiofsd (as one user)

Implementation: https://gitlab.com/hreitz/virtiofsd-rs/-/commits/iommu

This is an example of an actual user. Updating all crates to IOMMU-supporting
versions actually does not require any changes to the code, but enabling the
'iommu' feature does: This feature makes the vhost-user-backend crate require
the VhostUserBackend::Memory associated type (because associated type defaults
are not stable yet), so this single line of code must be added (which sets the
type to GuestMemoryMmap<BitmapMmapRegion>).

Actually enabling IOMMU support is then a bit more involved, as it requires
switching away from GuestMemoryMmap to IommuMemory again.

However, to me, this shows that end users working with concrete types do not
seem to be affected by the incompatible IoMemory change until they want to opt
in to it. That’s because GuestMemoryMmap implements both GuestMemory and
IoMemory (thanks to the blanket impl), so it can transparently be used wherever
the updated crates expect to see an IoMemory type.

Requirements

Before submitting your PR, please make sure you addressed the following
requirements:

  • All commits in this PR have Signed-Off-By trailers (with
    git commit -s), and the commit message has max 60 characters for the
    summary and max 75 characters for each description line.
  • All added/changed functionality has a corresponding unit/integration
    test.
  • All added/changed public-facing functionality has entries in the "Upcoming
    Release" section of CHANGELOG.md (if no such section exists, please create one).
  • Any newly added unsafe code is properly documented.

@XanClic (Author) commented May 30, 2025

I know why the code coverage CI check fails: I (purposefully) don’t have unit tests for the new code.

Why the other tests failed, I don’t know; but I suspect it’s because I force-pushed an update (fixing the CHANGELOG.md link) while the tests were running, maybe SIGTERM-ing them. Looking at the timelines, they all failed (finished) between 16:27:02.985 and 16:27:02.995. (Except the coverage one, which is an actual failure.)

@XanClic (Author) commented May 30, 2025

Pushed an update without actual changes (just re-committing the top commit) to trigger a CI re-run. This time, only the coverage check failed (as expected).

XanClic added 6 commits July 10, 2025 13:27
With virtual memory, seemingly consecutive I/O virtual memory regions
may actually be fragmented across multiple pages in our userspace
mapping.  Existing `descriptor_utils::Reader::new()` (and `Writer`)
implementations (e.g. in virtiofsd or vm-virtio/virtio-queue) use
`GuestMemory::get_slice()` to turn guest memory address ranges into
valid slices in our address space; but with this fragmentation, it is
easily possible that a range no longer corresponds to a single slice.

To fix this, we can instead use `try_access()` to collect all slices,
but to do so, its region argument needs to have the correct lifetime so
we can collect the slices into a `Vec<_>` outside of the closure.

Signed-off-by: Hanna Czenczek <[email protected]>
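
(For illustration, a minimal self-contained sketch of how `try_access()` walks such a fragmented range region by region, against the released vm-memory API with the backend-mmap feature; here the data is consumed inside the closure, whereas the lifetime change in this commit is what additionally allows collecting the slices into a `Vec<_>` outside of it:)

use vm_memory::{GuestAddress, GuestMemory, GuestMemoryMmap, GuestMemoryRegion};

fn main() {
    // Two adjacent regions, so a range crossing 0x1000 is split across them.
    let mem = GuestMemoryMmap::<()>::from_ranges(&[
        (GuestAddress(0x0000), 0x1000),
        (GuestAddress(0x1000), 0x1000),
    ])
    .unwrap();

    let mut data = Vec::new();
    mem.try_access(0x20, GuestAddress(0x0ff0), |_offset, count, region_addr, region| {
        // `count` bytes starting at `region_addr` lie within `region`.
        let slice = region.get_slice(region_addr, count)?;
        let mut buf = vec![0u8; count];
        slice.copy_to(&mut buf);
        data.extend_from_slice(&buf);
        Ok(count)
    })
    .unwrap();

    // 0x10 bytes came from the first region, 0x10 from the second.
    assert_eq!(data.len(), 0x20);
}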
read() and write() must not ignore the `count` parameter: The mappings
passed into the `try_access()` closure are only valid for up to `count`
bytes, not more.

Signed-off-by: Hanna Czenczek <[email protected]>
When we switch to a (potentially) virtual memory model, we want to
compact the interface, especially removing references to memory regions
because virtual memory is not just split into regions, but pages first.

The one memory-region-referencing part we are going to keep is
`try_access()` because that method is nicely structured around the
fragmentation we will have to accept when it comes to paged memory.

`to_region_addr()` in contrast does not even take a length argument, so
for virtual memory, using the returned region and address is unsafe if
doing so crosses page boundaries.

Therefore, switch `Bytes::load()` and `store()` from using
`to_region_addr()` to `try_access()`.

Signed-off-by: Hanna Czenczek <[email protected]>
The existing `GuestMemory` trait is insufficient for representing
virtual memory, as it does not allow specifying the required access
permissions.

Its focus on all guest memory implementations consisting of a relatively
small number of regions is also unsuited for paged virtual memory with a
potentially very large set of non-contiguous mappings.

The new `IoMemory` trait in contrast provides only a small number of
methods that keep the implementing type’s internal structure more
opaque, and every access needs to be accompanied by the required
permissions.

Signed-off-by: Hanna Czenczek <[email protected]>
Rust only allows us to give one trait the blanket implementations for
`Bytes` and `GuestAddressSpace`.

We want `IoMemory` to be our primary external interface because it has
users specify the access permissions they need, and because we can (and
do) provide a blanket `IoMemory` implementation for all `GuestMemory`
types.

Therefore, replace requirements of `GuestMemory` by `IoMemory` instead.

Signed-off-by: Hanna Czenczek <[email protected]>
The Iommu trait defines an interface for translating virtual addresses
into addresses in an underlying address space.

It is supposed to do so by internally keeping an instance of the Iotlb
type, updating it with mappings whenever necessary (e.g. when
actively invalidated or when there’s an access failure) from some
internal data source (e.g. for a vhost-user IOMMU, the data comes from
the vhost-user front-end by requesting an update).

In a later commit, we are going to provide an implementation of
`IoMemory` that can use an `Iommu` to provide an I/O virtual address
space.

Note that while I/O virtual memory in practice will be organized in
pages, the vhost-user specification makes no mention of a specific page
size or how to obtain it.  Therefore, we cannot really assume any page
size and have to use plain ranges with byte granularity as mappings
instead.

Signed-off-by: Hanna Czenczek <[email protected]>
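
(A toy model of such byte-granularity mappings, for illustration only; the types and the translate() method below are stand-ins, not the Iommu/Iotlb API added in this series:)

use std::collections::BTreeMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Perm {
    Read,
    Write,
    ReadWrite,
}

struct Mapping {
    len: u64,
    target: u64, // base address in the underlying (VUA) address space
    perm: Perm,
}

// Maps the start of each IOVA range to its mapping; ranges are plain byte
// ranges because no page size can be assumed.
struct Iotlb {
    map: BTreeMap<u64, Mapping>,
}

impl Iotlb {
    fn translate(&self, iova: u64, perm: Perm) -> Option<u64> {
        let (&start, m) = self.map.range(..=iova).next_back()?;
        if iova < start + m.len && (m.perm == perm || m.perm == Perm::ReadWrite) {
            Some(m.target + (iova - start))
        } else {
            None
        }
    }
}

fn main() {
    let mut map = BTreeMap::new();
    map.insert(0x1000, Mapping { len: 0x123, target: 0x7f00_0000_0000, perm: Perm::ReadWrite });
    let tlb = Iotlb { map };
    assert_eq!(tlb.translate(0x1010, Perm::Read), Some(0x7f00_0000_0010));
    assert_eq!(tlb.translate(0x1010, Perm::Write), Some(0x7f00_0000_0010));
    assert_eq!(tlb.translate(0x2000, Perm::Read), None);
}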
@XanClic force-pushed the iommu branch 2 times, most recently from 5ee020b to 4104d5f on July 10, 2025 12:23
@XanClic changed the title from "[DRAFT] I/O virtual memory (IOMMU) support" to "I/O virtual memory (IOMMU) support" on Jul 10, 2025
@XanClic marked this pull request as ready for review on July 10, 2025 12:39
@XanClic (Author) commented Jul 10, 2025

Added tests for the new functionality, and rebased on the main branch.

@XanClic (Author) commented Jul 11, 2025

cc @germag

XanClic added 3 commits July 30, 2025 10:46
This `IoMemory` type provides an I/O virtual address space by adding an
IOMMU translation layer to an underlying `GuestMemory` object.

Signed-off-by: Hanna Czenczek <[email protected]>
The vhost-user-backend crate will need to be able to modify all existing
memory regions to use the VMM user address instead of the guest physical
address once the IOMMU feature is switched on, and vice versa.  To do
so, it needs to be able to modify regions’ base address.

Because `GuestMemoryMmap` stores regions wrapped in an `Arc<_>`, we
cannot mutate them after they have been put into the `GuestMemoryMmap`
object; and `MmapRegion` itself is by its nature not clonable.  So to
modify the regions’ base addresses, we need some way to create a new
`GuestRegionMmap` referencing the same `MmapRegion` as another one, but
with a different base address.

We can do that by having `GuestRegionMmap` wrap its `MmapRegion` in an
`Arc`, adding a method to return a reference to that `Arc`, and adding a
method to construct a `GuestRegionMmap` object from such a cloned `Arc`.

Signed-off-by: Hanna Czenczek <[email protected]>
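
(A self-contained toy of the Arc-sharing pattern described here; `Region` stands in for `MmapRegion` and `PlacedRegion` for `GuestRegionMmap`, so names and fields are illustrative only:)

use std::sync::Arc;

// `Region` stands in for `MmapRegion`: the actual backing memory, which is not
// clonable in the real crate.
struct Region {
    backing: Vec<u8>,
}

// `PlacedRegion` stands in for `GuestRegionMmap`: the same backing region
// placed at some base address.
struct PlacedRegion {
    base: u64,
    region: Arc<Region>,
}

impl PlacedRegion {
    fn new(base: u64, region: Arc<Region>) -> Self {
        PlacedRegion { base, region }
    }

    /// Reference to the shared inner region, so a second `PlacedRegion` with a
    /// different base address can reference the same backing memory.
    fn inner(&self) -> &Arc<Region> {
        &self.region
    }
}

fn main() {
    let gpa_region = PlacedRegion::new(0x4000_0000, Arc::new(Region { backing: vec![0; 4096] }));
    // Re-place the same backing region at the VMM user address instead.
    let vua_region = PlacedRegion::new(0x7f00_0000_0000, Arc::clone(gpa_region.inner()));
    assert_ne!(gpa_region.base, vua_region.base);
    assert_eq!(gpa_region.inner().backing.len(), vua_region.inner().backing.len());
}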
Without an IOMMU, we have direct access to guest physical addresses
(GPAs).  In order to track our writes to guest memory (during
migration), we log them into dirty bitmaps, and a page's bit index is
its GPA divided by the page size.

With an IOMMU, however, we no longer know the GPA, instead we operate on
I/O virtual addresses (IOVAs) and VMM user-space addresses (VUAs).
Here, the dirty bitmap bit index is the IOVA divided by the page size.

`IoMemory` types contain an internal "physical" memory type that
operates on these VUAs (`IoMemory::PhysicalMemory`).  Any bitmap
functionality that this internal type may already have (e.g.
`GuestMemoryMmap` does) cannot be used for dirty bitmap tracking with an
IOMMU because it would use the VUA, but we need to use the IOVA, and
this information is not available on that lower layer.

Therefore, `IoMemory` itself needs to support bitmaps separately from
its inner `PhysicalMemory`, which will be used when the IOMMU is in use.
Add an associated `IoMemory::Bitmap` type and add a bitmap object to
`IommuMemory`.  Ensure that writes to memory dirty that bitmap
appropriately:
- In `try_access()`, if write access was requested, dirty the handled
  region of the bitmap after the access is done.
- In `get_slice()`, replace the `VolatileSlice`'s bitmap (which comes
  from the inner `PhysicalMemory`) by the correct slice of our IOVA
  bitmap before returning it.

Signed-off-by: Hanna Czenczek <[email protected]>
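
(A toy sketch of the indexing described above: with an IOMMU, the dirty-bitmap bit index is derived from the IOVA, not from the GPA or VUA; the page size and bitmap layout here are illustrative:)

const PAGE_SIZE: u64 = 4096;

struct DirtyBitmap {
    bits: Vec<bool>,
}

impl DirtyBitmap {
    fn new(size: u64) -> Self {
        DirtyBitmap { bits: vec![false; (size / PAGE_SIZE) as usize] }
    }

    /// Mark every page touched by a write of `len` bytes at `iova` as dirty;
    /// the bit index is the IOVA divided by the page size.
    fn mark_dirty(&mut self, iova: u64, len: u64) {
        let first = iova / PAGE_SIZE;
        let last = (iova + len - 1) / PAGE_SIZE;
        for page in first..=last {
            self.bits[page as usize] = true;
        }
    }
}

fn main() {
    let mut bitmap = DirtyBitmap::new(1 << 20);
    // A 0x20-byte write at IOVA 0x1ff0 crosses the 0x1000/0x2000 page boundary.
    bitmap.mark_dirty(0x1ff0, 0x20);
    assert!(bitmap.bits[1] && bitmap.bits[2]);
    assert!(!bitmap.bits[0]);
}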
@XanClic (Author) commented Jul 30, 2025

Added support for bitmaps in I/O virtual address space (required for migration).

XanClic added 3 commits July 30, 2025 18:58
This commit also adds the iommu feature to the coverage_config feature
list.  (I left the aarch64 coverage value unchanged; I cannot find out
how to get the current value on my system, and it isn’t included in CI.)

Signed-off-by: Hanna Czenczek <[email protected]>
Document in DESIGN.md how I/O virtual memory is handled.

Signed-off-by: Hanna Czenczek <[email protected]>
if len < expected {
    return Err(Error::PartialBuffer {
        expected,
        completed: len,
Member:

completed should be 0, since you didn't read anything. So here you could also return Ok(0) ("no more data") and let the if below return Error::PartialBuffer.

Author:

Right!

if len < expected {
    return Err(Error::PartialBuffer {
        expected,
        completed: len,
Member:

Same as above.

Author:

Indeed.


/// Permissions for accessing virtual memory.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub enum Permissions {
Member:

What about defining this like:

pub struct Permissions(u8);
impl Permissions {
    pub const No: Permissions = Permissions(0);
    pub const Read: Permissions = Permissions(1);
    pub const Write: Permissions = Permissions(2);
    pub const ReadWrite: Permissions = Permissions(3);
}

and implementing & and | as just a bitwise operation on x.0? (Basically reinventing the bitflags crate to avoid introducing a new dependency)

Member:

(and also reinventing it to hide the underlying implementation of the fake enum)

Author:

I’m not dead-set; for bitfields, such a struct with constants can make more sense. On the other hand, I like using an enum if anything just to check that match is exhaustive (e.g. when we need to translate Permissions into VhostAccess).

FWIW, starting from opt-level = 2, this does get optimized properly:

   0:   48 89 f8                mov    rax,rdi
   3:   48 21 f0                and    rax,rsi
   6:   c3                      ret

As a compromise, I could do repr(u8) with bitand(x, y) { ((x as u8) & (y as u8)).try_into().unwrap() } and an accompanying TryFrom<u8> implementation. Even on opt-level = 0, that would yield:

   0:   48 89 f8                mov    rax,rdi
   3:   48 89 44 24 f0          mov    QWORD PTR [rsp-0x10],rax
   8:   48 89 74 24 f8          mov    QWORD PTR [rsp-0x8],rsi
   d:   48 21 f0                and    rax,rsi
  10:   c3                      ret

(opt-level = 1 then also removes the stack accesses, as above)

Member:

On the other hand, I like using an enum if anything just to check that match is exhaustive (e.g. when we need to translate Permissions into VhostAccess).

Ah, you're right. Exhaustiveness is important.

Let's go for the repr, but with a private panicking fn from_repr(x: u8) -> Self instead of the TryFrom trait.

Author:

Sounds good!
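
(For reference, a minimal sketch of the design agreed on in this thread: a repr(u8) enum with a private, panicking from_repr() and the bitwise operators implemented on the representation. Variant names follow the constants proposed above and need not match the PR exactly:)

use std::ops::{BitAnd, BitOr};

#[derive(Clone, Copy, Debug, Eq, PartialEq)]
#[repr(u8)]
pub enum Permissions {
    No = 0b00,
    Read = 0b01,
    Write = 0b10,
    ReadWrite = 0b11,
}

impl Permissions {
    /// Reconstruct a variant from its bit representation.  Panics on values
    /// outside 0..=3, which the bitwise operators below cannot produce.
    fn from_repr(x: u8) -> Self {
        match x {
            0b00 => Permissions::No,
            0b01 => Permissions::Read,
            0b10 => Permissions::Write,
            0b11 => Permissions::ReadWrite,
            _ => panic!("invalid Permissions repr: {}", x),
        }
    }
}

impl BitAnd for Permissions {
    type Output = Permissions;
    fn bitand(self, rhs: Permissions) -> Permissions {
        Permissions::from_repr((self as u8) & (rhs as u8))
    }
}

impl BitOr for Permissions {
    type Output = Permissions;
    fn bitor(self, rhs: Permissions) -> Permissions {
        Permissions::from_repr((self as u8) | (rhs as u8))
    }
}

fn main() {
    assert_eq!(Permissions::ReadWrite & Permissions::Read, Permissions::Read);
    assert_eq!(Permissions::Read | Permissions::Write, Permissions::ReadWrite);
}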

}
}

impl<M: GuestMemory> IoMemory for M {
Member:

I'd consider just changing GuestMemory. Is anyone using try_access directly? It's not private, but almost so.

Author:

I did consider that, and I put the reasoning why I decided not to do so (and indeed failed to do so) into the opening comment: The most immediate reason is that anyone accessing memory needs to specify the required access permissions. Yes, most accesses go through the Bytes interface, i.e. already specify the intended access, but not all of them:

Which gets me to the second question: There are users (like descriptor_utils used by virtiofsd or vm-virtio/virtio-queue) that currently use get_slice(). get_slice() is dangerous because it cannot cope with fragmented memory, which becomes especially apparent when using paged virtual memory. So in fact my current draft replaces those get_slice() usages by try_access().

We could of course replace get_slice() by something like get_slices() -> impl Iterator<Item = VolatileSlice>, but really, that wouldn’t offer more than the try_access() we already have, just replace the interior iterator by an exterior one. (Though I’m open to the idea.)

Besides the access mode, there’s also the general idea I outlined in the original comment that GuestMemory is quite open about how it is structured into a limited set of GuestMemoryRegion objects internally. For virtual memory, that’s no longer applicable: Pages and guest memory regions are not only conceptually different; access to single pages is also quite useless. (Whereas GuestMemoryRegion is fundamental to how the vhost code constructs the GuestMemory from the information it gets from the front-end, so it must be visible.)

If we continued to use GuestMemory, I’m afraid it would be impossible to verify at compile time that all the accesses are sound. For example, a user could try to access virtual memory’s GuestMemoryRegion objects, and that would only return an error at runtime.

Member:

Yes, that's pretty convincing. The one thing I'd change however is to always expose IoMemory and only put IommuMemory/Iommu/Iotlb under the new feature.

This way, the Bytes<GuestAddress> implementation can simply move to IoMemory.

Author:

That’s how it already is, though. I’m happy to keep it that way 😅 but by putting IoMemory under the iommu feature, users could remain unchanged until they need/want IOMMU support.

(The problem is that the blanket impl Bytes<_> for IoMemory plus blanket impl IoMemory for GuestMemory does not auto-extend to Bytes<_> for GuestMemory, so with the introduction of IoMemory, users need to be changed to at least import IoMemory.)

Member:

I think you'd get it just by importing Bytes...

The problem with moving IoMemory to the new feature is that features are additive. You can't make the iommu feature pick whether Bytes<> is added to GuestMemory or IoMemory.

Unrelated to this, maybe we should bite the bullet and add a prelude...

Member:

Testcase:

lib.rs:

pub trait GuestMemory {}
pub trait IoMemory {}

impl<T: GuestMemory> IoMemory for T {}

pub trait HelloWorld {
    fn hello_world(&self);
}

impl<T: IoMemory> HelloWorld for T {
     fn hello_world(&self) {}
}

test.rs:

use lib::{GuestMemory, HelloWorld};

pub struct Foo;
impl GuestMemory for Foo {}

fn main() {
    Foo.hello_world();
}
$ rustc --crate-type=rlib lib.rs --edition=2021
$ rustc test.rs --extern lib=liblib.rlib --edition=2021

@XanClic (Author) commented Aug 12, 2025

Interesting. It doesn’t work for vm-virtio/virtio-queue because that has a relaxed ?Sized requirement. Extending your example:

use lib::{GuestMemory, HelloWorld};

pub struct Foo;
impl GuestMemory for Foo {}

pub fn hello<T>(foo: &T)
where
    T: GuestMemory + ?Sized,
{
    foo.hello_world();
}

fn main() {
    hello(&Foo);
}

It works fine without the ?Sized.

Removing the ?Sized from virtio-queue and adding + Sized to <M as Deref>::Target: GuestMemory where that appears makes it work indeed. But I’m not sure whether they’ll be happy about that…

Author:

Great news! If I just add + ?Sized to the IoMemory for GuestMemory impl, it works:

pub trait IoMemory {
    type PhysicalMemory: GuestMemory + ?Sized;
    /* ... */
}

impl<M: GuestMemory + ?Sized> IoMemory for M {
    /* ... */
}

@@ -87,6 +93,8 @@ impl std::ops::BitAnd for Permissions {
pub trait IoMemory {
    /// Underlying `GuestMemory` type.
    type PhysicalMemory: GuestMemory;
    /// Dirty bitmap type for tracking writes to the IOVA address space.
    type Bitmap: Bitmap;
Member:

I am not sure I like the design here. The IOVA address space, being virtual, can have aliases. Could you still store the bitmap in the low-level memory region, but with accesses going through IOMMU translation?

What does vhost-user require? Are dirty bitmap accesses done in IOVA or GPA space?

Author:

Are dirty bitmap accesses done in IOVA or GPA space?

They’re done in IOVA space, which is why I think we need the bitmap on this level.

(If we continued to store it in e.g. GuestRegionMmap/MmapRegion, it would need a reverse translation, and then keep a mapping of address ranges to bitmap slices instead of just a single bitmap slice linearly covering the entire region.)

@bonzini (Member) commented Aug 12, 2025

They’re done in IOVA space, which is why I think we need the bitmap on this level.

I see. Maybe you could add a bitmap::BS<'a, Self::Bitmap> as another argument that try_access() passes back to the callback? And then IommuMemory::get_slice() can pass that argument to replace_bitmap().

This way, the Iommu controls whether dirty tracking is done in IOVA or GPA space.

Author:

You mean putting the responsibility on the try_access() callback to dirty a bitmap slice given to it if it has written anything?

I’m not sure; that does sound more than reasonable, but it would require more changes in the callers than just adding the Permissions flag. 🙂

Member:

You mean putting the responsibility on the try_access() callback to dirty a bitmap slice given to it if it has written anything?

No, I confused try_access and get_slice, sorry. For IoMemory, dirtying is already done in try_access itself; for Iommu, the slice could be returned by translate, that is, it would be part of the Iotlb entry?

Author:

I’m sorry, I don’t quite follow; currently, dirtying is done only by IommuMemory (in its IoMemory implementation), not by IoMemory or Iommu in general.

<IommuMemory as IoMemory>::try_access() dirties the bitmap itself, right.

<IommuMemory as IoMemory>::get_slice() has the VolatileSlice do the work. For this, it replaces the VolatileSlice’s internal bitmap slice (which is initially set by the GuestMemoryRegion) by the right slice in the IOVA space.

I don’t think the IOMMU object should do any of this, and I don’t think it should return slices (if I understand correctly). I think it should only do the translation and not care about the actual memory.

@bonzini (Member) left a comment

Overall this is good stuff. It also has a lot of overlap with the experimental changes that were necessary in order to use vm-memory in QEMU, which is a very good thing.

I have two main questions:

  • do we really need an IoMemory, or can we change GuestMemory?
  • if we need an IoMemory, should the PhysicalMemory: GuestMemory associated
    type instead be an AS: GuestAddressSpace (and likewise for physical_memory())?

I'm marking this as "request changes" because there would be changes either way.

@bonzini (Member) left a comment

Please extract the first three commits, up to "Bytes: Do not use to_region_addr()", into a separate PR.

@roypat (Member) commented Aug 11, 2025

+1 to changing GuestMemory. Could we maybe parameterize it by the address space? E.g. GuestMemory<GPA> and GuestMemory<IOVA> (and then indeed introduce separate guest address types for guest physical and iommu addresses)?

@bonzini (Member) commented Aug 11, 2025

and then indeed introduce separate guest address types for guest physical and iommu addresses

No, I don't think that is a good idea. IOVA and GPA are the same concept, though in different address spaces.

My hope when leaving the review was that no changes are needed outside vm-memory, and the interpretation of GuestAddress can be left to whoever creates the GuestMemory.

@XanClic (Author) commented Aug 12, 2025

Could we maybe parameterize it by the address space? E.g. GuestMemory<GPA> and GuestMemory<IOVA> (and then indeed introduce separate guest address types for guest physical and iommu addresses)?

The main practical problem with using different types (as stated in the opening comment) is that users generally don’t even know whether a given address is a GPA or an IOVA. For example, the addresses you get from vrings and vring descriptors can be either; it just depends on whether the IOMMU is enabled or not. Users should be able to use the same code whether it is on or off, which (I think) wouldn’t really be possible with different types.

Whether to add IoMemory or change GuestMemory

I hope Paolo will add a comment to that effect (we just had a talk); we agreed that adding IoMemory is OK as long as it is guarded under the new iommu feature.

I don’t like keeping GuestMemory because I think actually none of its current methods will work with virtual memory, and so I would prefer adding a new interface that makes it clear that these aren’t available at compile time.

(Paolo (maybe half-jokingly) suggested making the implementation of that a link-time error by accessing undefined references in a hypothetical impl GuestMemory for IommuMemory implementation, which is fun, but I think just guarding the new trait under the iommu feature makes more sense.)

@bonzini (Member) commented Aug 12, 2025

I don’t like keeping GuestMemory because I think actually none of its current methods will work with virtual memory, and so I would prefer adding a new interface that makes it clear that these aren’t available at compile time.

Yes, GuestMemory is a nice interface for implementation but clients should switch to IoMemory.

@roypat (Member) commented Aug 12, 2025

I don’t like keeping GuestMemory because I think actually none of its current methods will work with virtual memory, and so I would prefer adding a new interface that makes it clear that these aren’t available at compile time.

Yes, GuestMemory is a nice interface for implementation but clients should switch to IoMemory.

Different question about this: Do we really need the IoMemory trait, or could consumers just switch to IommuMemory<M: GuestMemory, I: Iommu> (e.g. be parametric by M and I instead of T: IoMemory, and for "no iommu" systems we can have a no-op implementation of Iommu)? That would also side-step all the problems of conflicting default implementations (although I'm not sure if the compiler will be clever enough to optimize the no-op iommu setup as well as with the special impl of IoMemory for GuestMemory :/)

@XanClic (Author) commented Aug 12, 2025

could consumers just switch to IommuMemory<M: GuestMemory, I: Iommu> (e.g. be parametric by M and I instead of T: IoMemory, and for "no iommu" systems we can have a no-op implementation of Iommu)?

Doesn’t sound bad, but would be a more “radical change”: Intermediate users like virtio-queue currently use memory types via M: GuestMemory. Converting them to M: IoMemory is straightforward and allows their users to continue to use non-virtual memory types like GuestMemoryMmap with no change needed if they don’t need IOMMU functionality.

If we change patterns like foo<M: Deref<Target: GuestMemory>>(mem: M) to foo<M: Deref<Target = IommuMemory>>(mem: M), all users will need to be changed to use IommuMemory.

@roypat (Member) commented Aug 12, 2025

could consumers just switch to IommuMemory<M: GuestMemory, I: Iommu> (e.g. be parametric by M and I instead of T: IoMemory, and for "no iommu" systems we can have a no-op implementation of Iommu)?

Doesn’t sound bad, but would be a more “radical change”: Intermediate users like virtio-queue currently use memory types via M: GuestMemory. Converting them to M: IoMemory is straightforward and allows their users to continue to use non-virtual memory types like GuestMemoryMmap with no change needed if they don’t need IOMMU functionality.

If we change patterns like foo<M: Deref<Target: GuestMemory>>(mem: M) to foo<M: Deref<Target = IommuMemory>>(mem: M), all users will need to be changed to use IommuMemory.

Mh, potentially this wouldn’t be that much churn, because we could have impl<M: GuestMemory> From<M> for IommuMemory<M, NoIommu> { ... }, and then users of, say, virtio-queue would only have to do a &(...).into() when passing memory. And having fewer traits in vm-memory would be a win for maintainability imo.

Somewhat related question, since I'm seeing the Deref bound there: if M becomes GuestAddressSpace, and the iommu is stored as I: Deref<Target=...> as outlined in #327 (comment), will there still be a need to put the IommuMemory behind a Deref itself?
